feat: make auto-recall query length limit configurable (autoRecallMaxQueryLength) #374

Open
jlin53882 wants to merge 1 commit into CortexReach:master from jlin53882:feat/configurable-auto-recall-query-length

@jlin53882
Contributor

Summary

Make the auto-recall query length limit configurable via config.autoRecallMaxQueryLength, defaulting to 2000 chars.

Problem

When auto-recall is enabled, the full event.prompt (which may include long file attachment descriptions, conversation metadata, or multi-turn context) is passed directly as the embedding query. This can dilute the semantic signal and produce lower-quality retrieval results.

Solution

Add FR-04 truncation logic before embedding, configurable via config.autoRecallMaxQueryLength:

const MAX_RECALL_QUERY_LENGTH = config.autoRecallMaxQueryLength ?? 2_000;
let recallQuery = event.prompt;
if (recallQuery.length > MAX_RECALL_QUERY_LENGTH) {
  const originalLength = recallQuery.length;
  recallQuery = recallQuery.slice(0, MAX_RECALL_QUERY_LENGTH);
  api.logger.info(
    `memory-lancedb-pro: auto-recall query truncated from ${originalLength} to ${MAX_RECALL_QUERY_LENGTH} chars`
  );
}

Design notes

  • Mirrors existing pattern: autoRecallMaxItems, autoRecallMaxChars, autoRecallPerItemMaxChars all follow the same config.X ?? default pattern
  • Default of 2000: Allows richer queries than the 1000-char limit some users had patched locally
  • Log visibility: When truncation occurs, an info-level log is emitted so operators can observe behavior
  • Consistent with FR-04 intent: "Auto-recall only needs the user's intent, not full attachment text"
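
One detail the snippet above leaves open is that `String.prototype.slice` counts UTF-16 code units, so a cut at exactly the limit can split an emoji or other astral character in two. The following is a minimal sketch of a standalone helper that avoids this; `truncateRecallQuery`, `TruncationResult`, and `DEFAULT_MAX_RECALL_QUERY_LENGTH` are hypothetical names for illustration and are not part of this PR.

```typescript
// Hypothetical helper, not part of the PR: truncates a prompt to at most
// maxLength UTF-16 code units without splitting a surrogate pair.
interface TruncationResult {
  query: string;
  truncated: boolean;
  originalLength: number;
}

const DEFAULT_MAX_RECALL_QUERY_LENGTH = 2_000;

function truncateRecallQuery(
  prompt: string,
  maxLength: number = DEFAULT_MAX_RECALL_QUERY_LENGTH,
): TruncationResult {
  const originalLength = prompt.length;
  // Guard against negative or fractional limits.
  let end = Math.max(0, Math.floor(maxLength));
  if (originalLength <= end) {
    return { query: prompt, truncated: false, originalLength };
  }
  // If the last kept code unit is a high surrogate, its low surrogate
  // would be cut off, so back up by one code unit.
  if (end > 0) {
    const lastUnit = prompt.charCodeAt(end - 1);
    if (lastUnit >= 0xd800 && lastUnit <= 0xdbff) {
      end -= 1;
    }
  }
  return { query: prompt.slice(0, end), truncated: true, originalLength };
}
```

Returning `truncated` and `originalLength` lets the caller decide whether and how to log, which keeps the helper free of any dependency on `api.logger`.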

Related

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c0ab0ba0ca

// FR-04: Truncate long prompts (e.g. file attachments) before embedding.
// Auto-recall only needs the user's intent, not full attachment text.
const MAX_RECALL_QUERY_LENGTH = config.autoRecallMaxQueryLength ?? 2_000;

[P2] Wire new query-length setting into parsed config

config.autoRecallMaxQueryLength is read here, but parsePluginConfig never copies cfg.autoRecallMaxQueryLength into the returned PluginConfig, so this value is always undefined at runtime and the code always falls back to 2_000. In practice, users cannot actually configure the limit despite the commit message and inline docs saying they can.
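
A minimal sketch of the kind of wiring this comment asks for, assuming `parsePluginConfig` takes the raw config object and returns a typed `PluginConfig`. The real function's signature and validation rules are not shown in this thread, so the shapes below, including the `parsePositiveInt` helper, are hypothetical.

```typescript
// Hypothetical shapes for illustration; the plugin's actual PluginConfig
// and parsePluginConfig are not shown in this thread and may differ.
interface PluginConfig {
  autoRecallMaxChars?: number;
  autoRecallMaxQueryLength?: number;
}

// Accepts only positive integers; anything else becomes undefined so the
// caller's `?? default` fallback still applies.
function parsePositiveInt(value: unknown): number | undefined {
  return typeof value === "number" && Number.isInteger(value) && value > 0
    ? value
    : undefined;
}

function parsePluginConfig(raw: unknown): PluginConfig {
  const cfg = (raw !== null && typeof raw === "object" ? raw : {}) as Record<
    string,
    unknown
  >;
  return {
    autoRecallMaxChars: parsePositiveInt(cfg.autoRecallMaxChars),
    // Without this line the field is dropped and the hook always sees undefined.
    autoRecallMaxQueryLength: parsePositiveInt(cfg.autoRecallMaxQueryLength),
  };
}

// Usage: the configured value now reaches the fallback expression.
const parsedExample = parsePluginConfig({ autoRecallMaxQueryLength: 500 });
const effectiveLimit = parsedExample.autoRecallMaxQueryLength ?? 2_000;
```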

Collaborator

@AliceLJY AliceLJY left a comment

LGTM — changes are clean, on-topic, and well-tested. Approving.

@jlin53882
Contributor Author

Conflict Analysis

This PR has a base that's 7 commits behind master. During that gap, the index.ts auto-recall block was heavily refactored:

  • recallMode ("full"/"summary"/"adaptive") was added
  • autoRecallExcludeAgents exclusion logic was added
  • The entire before_prompt_build hook structure changed significantly

The MAX_RECALL_QUERY_LENGTH truncation logic in this PR targets the old architecture. After a rebase onto current master, it would need to be re-applied at the correct location in the new structure.

Proposed fix approach:

  1. Rebase this PR onto current master (7fe2ae0)
  2. Re-apply the truncation logic in the new before_prompt_build hook — likely after accessibleScopes is set and before retrieveWithRetry is called
  3. Add autoRecallMaxQueryLength to PluginConfig interface and parsePluginConfig (similar to how autoRecallMaxChars is handled)
  4. Add autoRecallMaxQueryLength to openclaw.plugin.json schema

This would be a straightforward rebase + re-apply.
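
Step 2 could look roughly like the sketch below. The refactored `before_prompt_build` hook is not shown in this thread, so `recallForPrompt`, `RecallDeps`, and the signature given to `retrieveWithRetry` are stand-ins chosen for illustration; only the ordering (scopes resolved first, then truncation, then retrieval) is taken from the plan above.

```typescript
// Stand-in types: the real hook, logger, and retrieveWithRetry signature
// on master are not shown in this thread and may differ.
type Retrieve = (query: string, scopes: string[]) => Promise<string[]>;

interface RecallDeps {
  maxQueryLength?: number; // would come from config.autoRecallMaxQueryLength
  retrieveWithRetry: Retrieve;
  logInfo: (msg: string) => void;
}

async function recallForPrompt(
  prompt: string,
  accessibleScopes: string[], // assumed to be resolved before this point
  deps: RecallDeps,
): Promise<string[]> {
  const maxLength = deps.maxQueryLength ?? 2_000;
  let recallQuery = prompt;
  if (recallQuery.length > maxLength) {
    deps.logInfo(
      `auto-recall query truncated from ${recallQuery.length} to ${maxLength} chars`,
    );
    recallQuery = recallQuery.slice(0, maxLength);
  }
  // Only the truncated query is passed on to retrieval.
  return deps.retrieveWithRetry(recallQuery, accessibleScopes);
}
```

Passing the retrieval function and logger in as dependencies is a choice made for this sketch so the truncation step can be exercised in isolation; the plugin itself may call them directly.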

@xiaoyuervae would you like help with the rebase? I can handle it on this PR if you'd like.

@AliceLJY
Collaborator

Hey @jlin53882, thanks for this — truncating long prompts before embedding is a solid improvement for auto-recall quality.

Your conflict analysis is spot-on. The main blocker right now is the merge conflict with master (the before_prompt_build hook was refactored significantly). As you identified, the rebase also needs:

  • autoRecallMaxQueryLength added to the PluginConfig interface and parsePluginConfig
  • autoRecallMaxQueryLength added to the openclaw.plugin.json schema

These are needed so the config is discoverable via schema validation and any UI that reads the plugin schema.
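
As a rough illustration of what such a schema entry might carry, here is a JSON-Schema-style fragment written as a TypeScript object. The actual layout of openclaw.plugin.json is not shown in this thread, so the property names, the `minimum` of 1, the description text, and the UI label are all assumptions.

```typescript
// Hypothetical schema fragment; the real openclaw.plugin.json layout and
// constraints may differ.
const autoRecallMaxQueryLengthSchema = {
  type: "integer",
  minimum: 1,
  default: 2000,
  description:
    "Maximum number of characters of the prompt used as the auto-recall embedding query.",
} as const;

// Hypothetical UI hint for config editors that read the plugin schema.
const autoRecallMaxQueryLengthUiHint = {
  label: "Auto-Recall Max Query Length",
} as const;

// Small check mirroring the constraints declared above.
function isValidMaxQueryLength(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value >= autoRecallMaxQueryLengthSchema.minimum
  );
}
```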

Once you've rebased and added the config plumbing, happy to re-review! Thanks for the thorough conflict diagnosis. 🙏

…lMaxQueryLength (default 2000)

- Add FR-04 truncation before embedding to keep query focused
- Make MAX_RECALL_QUERY_LENGTH configurable via config.autoRecallMaxQueryLength
- Default to 2000 chars (previously hardcoded 1000 in some builds)
- Add info log when truncation occurs for visibility
- Mirrors existing config pattern (autoRecallMaxItems, autoRecallMaxChars, etc.)
@jlin53882 jlin53882 force-pushed the feat/configurable-auto-recall-query-length branch from c0ab0ba to 217a6ef Compare April 1, 2026 01:43
@jlin53882
Contributor Author

Hi @AliceLJY — PR has been rebased onto latest master and all conflicts resolved. Changes:

  1. Rebased onto upstream/master (before_prompt_build hook refactor absorbed)
  2. autoRecallMaxQueryLength added to PluginConfig interface
  3. autoRecallMaxQueryLength added to parsePluginConfig (default: 2000)
  4. autoRecallMaxQueryLength added to openclaw.plugin.json schema + UI label

Merge state: ✅ CLEAN / MERGEABLE

Ready for re-review 🙏
